1. INTRODUCTION

Recently, logo classification [1]-[4] has become a significant direction in computer vision because
automating it saves time and effort. Automated document classification has therefore become an urgent
requirement, since it makes searching for particular documents easier. In recent years, there has
been considerable interest in document image processing and comprehension for a range of applications such as
digital repositories, internet publishing and surfing, online shopping, and official automation systems.
The identification of logos is a reliable method for document image analysis and retrieval. Logos are used to
mark the source of a text. They are 2D shapes of various types and are usually a mix of graphical and
text elements [5]. Logo detection and recognition belong to two fields of computer science (computer vision
and pattern recognition), and logo recognition is considered a special case of image recognition [6]-[8].
Because a logo usually combines text and graphic symbols according to its design, recognizing it is considered
a very difficult task, especially when it differs from the trained logo in size, rotation,
resolution, lighting, color, and other properties.

In this work, the main contribution is a new convolutional neural network (CNN)
architecture that recognizes and classifies several types of logos with different characteristics. The model is
robust to changes in rotation, scale, and lighting conditions, and it was applied to different
logos with different characteristics. The results show that the proposed model gives acceptable and reliable
outcomes on logo classification.

The remainder of this paper is organized as follows. The literature review is presented next. Section 2
describes the proposed framework, including the dataset used in the experiments, the proposed CNN
model, and a comparison with recent work on logo classification. Section 3 reports the results, and
Section 4 draws the conclusion.
Literature review

Logo recognition plays a major role in areas such as security, advertising, and classifying
the documents of different bodies by their logos. Many studies have therefore introduced different
methods to tackle this problem. Some methods are based on handcrafted features, while others use deep
learning algorithms. Some of these studies are described below.

Hassanzadeh and Pourghassem [9] proposed a method based on handcrafted features, in which a new spatial
feature extraction for logo recognition is combined with k-nearest neighbor (KNN) to detect and recognize
a novel logo. The feature is based on a histogram of object occurrence in a new tessellation of the
logo image. The experimental results emphasized the effectiveness of the proposed algorithm on noisy logos and
logos with separated parts.

Handcrafted features have been used to classify logos in [7]-[13]. Among these, the methods presented
in [7]-[11] applied the histogram of oriented gradients (HOG) and the scale-invariant feature transform (SIFT)
to printed images in order to recognize logos. In the experiments of [10], HOG outperformed
SIFT significantly: the HOG approach found logos with a precision of 54% and a recall of 29%, whereas
the SIFT method obtained only 21% precision and 14% recall. The method in [11] reported 93.50% precision,
77.94% recall, and an F1-score of 85.02%.
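As a quick consistency check on the figures quoted above, the F1-score is the harmonic mean of precision and recall. The short Python sketch below recomputes it from the reported precision and recall values; the function name `f1_score` is illustrative and is not taken from the cited works.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for [11], in percent.
f1_ref11 = f1_score(93.50, 77.94)

# Precision and recall reported for the HOG approach in [10], in percent.
f1_hog = f1_score(54.0, 29.0)

print(round(f1_ref11, 2), round(f1_hog, 2))
```

The value recomputed for [11] comes out at roughly 85.0, in line with the reported 85.02% up to rounding of the inputs.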

Llorca et al. [12] and Sulehria and Zhang [13] introduced methods to classify vehicle logos. The former
presented a scheme named “LogoSENSE” that applies HOG to logos of fixed size, with a support vector
machine (SVM) equipped with a max-margin loss to decrease the number of false positives. The work in [13]
uses mathematical morphology as a shape descriptor to classify the logo. In general, handcrafted features are
sensitive to noise and image distortion.

Ozay and Sankur [14] presented an automatic TV logo classification system that locates static regions from
time-averaged edges subjected to post-processing operations. Once the region of interest of a logo candidate
is identified, TV logos are classified via their subspace features. A comparative analysis of features
showed that the ICA-II architecture is the most discriminative, with an accuracy of 99.2% on a dataset of
3040 logo images (152 varieties). Online tests of both detection and classification on running videos
achieved 96.0% average accuracy. A more reliable logo identifier would be feasible by improving the accuracy
of the extracted logo mask.

Kumar et al. [15] proposed a method to classify color logos by extracting general features
(color, shape, and texture) and merging them in different ways. Each logo is classified as containing
only text, only symbols, or both text and symbols. A KNN classifier is used for classification.

The method was evaluated on the UoMLogo dataset, which is divided into three classes: logos that mix
text and symbols, text-only logos, and symbol-only logos. The reported accuracy for text logos, symbol logos,
and mixed text-and-symbol logos is 42.06, 43.58 and 48.98, respectively [15].

A multi-level context-guided classification method with object-based convolutional neural networks
(MLCG-OCNN) was proposed in [16]. The model consists of an object-level contextual guided object-based CNN
that carries out per-object classification by using image segmentation and merging the high-level
features of spectral patterns, geometric characteristics, and contextual information. The per-object
classification result is then refined with pixel-level contextual guidance by means of a conditional random
field (CRF). The results showed that the method achieves remarkable classification performance (> 80%).
Compared with the state-of-the-art architecture DeepLabV3+, the MLCG-OCNN method demonstrates high
computational efficiency for very high resolution imagery (VHRI) classification (4 to 5 times faster).

Patalappa and Chandramouli [17] worked on a dataset of 450 logos of Indian TV broadcast channels
(sports, movies, kids and cartoon, and entertainment), using different data augmentation
techniques to expand the logo corpus before classifying logos with deep learning (YOLO v2). Su et al. [18]
proposed a multi-perspective cross-class (MPCC) domain adaptation method for the case where only a fraction
of the logo classes is labeled whilst the remaining classes are annotated only with a clean icon image.
Extensive comparative experiments show the advantage of MPCC over existing state-of-the-art competitors
on the challenging Queen Mary University of London (QMUL)-OpenLogo dataset benchmark.

Oliveira et al. [19] presented an automatic graphic logo detection system for the FlickrLogos-32 dataset
that robustly handles unconstrained imaging conditions. It is based on fast region-based convolutional networks
(FRCN) and uses two CNN models pre-trained on the ImageNet large scale visual recognition challenge (ILSVRC)
dataset. The experiments achieved a top recognition F1-score of 0.909 with a
base learning rate of 0.001 at 30000 iterations and a threshold of 0.4. A deep learning method for logo
recognition was proposed in [20]; logo recognition is essential in numerous application domains [21].
The study carried out experiments on two datasets: FlickrLogos-32, which covers 32 distinct logo brands, and
Logos-32plus, which contains more elements than FlickrLogos-32 and is consequently regarded as the better
of the two. In [21], deep learning networks were applied to 2000 logos from 295K pictures
gathered from Amazon.

Su et al. [8] described a new incremental learning technique called scalable logo self-co-learning (SL2),
which is capable of autonomously self-discovering training data from noisy web imagery. Furthermore, a large
logo dataset called “WebLogo-2M” (2,190,757 pictures of 194 logo classes) was built with an automatic web
data collection and processing technique, and evaluations on it demonstrate the superiority of the proposed
SL2 method over the state of the art. Karimi and Behrad [22] enhanced logo discrimination by
employing three strategies based on deep convolutional neural networks (DCNNs). First, feature
extraction and classification are combined by using pre-trained deep models with SVM classifiers. Second,
existing pre-trained deep models are fine-tuned for logo recognition. Finally, the outputs of fine-tuned
DCNNs arranged in parallel structures are merged by a voting algorithm.

Tüzkö et al. [23] suggested an open set logo retrieval method, which they report to be better than closed set
logo retrieval approaches. It has several stages, based on convolutional neural networks (CNNs), that detect,
compare, and retrieve logos from images. In the detection stage, a
faster region-based convolutional neural network (Faster R-CNN) detects objects, extracts their features from
the image, and classifies them. The comparison stage then takes the logo features extracted in
the detection stage and compares them with query samples in the database by using cosine similarity.
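The comparison step can be illustrated with a minimal Python sketch of cosine-similarity matching. The feature vectors, the labels `logo_a` and `logo_b`, and the helper `best_match` are hypothetical and chosen only for illustration; they are not taken from [23], where the features come from a CNN detector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as no similarity
    return dot / (norm_a * norm_b)


def best_match(query, database):
    """Return (label, score) of the database entry most similar to the query."""
    scored = ((label, cosine_similarity(query, feat)) for label, feat in database.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical 3-dimensional features; real CNN features are much longer.
database = {"logo_a": [1.0, 0.0, 0.0], "logo_b": [0.0, 1.0, 1.0]}
label, score = best_match([0.1, 0.9, 1.1], database)
print(label, round(score, 3))
```

Because cosine similarity depends only on the direction of the vectors, it is insensitive to the overall magnitude of the extracted features, which is one common reason for choosing it in retrieval settings.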

Hou et al. [24] combined popular methods for logo classification. First,
fine-tuned CNN architectures produce four deep representations. These deep representations are then
merged with a number of imitative classifiers to perform logo classification. The proposed method was
applied to the newly built Logo-405 dataset.

Bianco et al. [25] used different methods to characterize the images. First, the recognition pipeline
takes the input image and extracts object proposal regions. Transformation pursuit is then applied to warp
the images to a common size, augment the training set, and produce an expanded query. Finally,
a CNN is applied to extract features, and an SVM is used to recognize and classify the logo.

Iandola et al. [26] utilized three DCNN architectures, GoogLeNet-GP,
GoogLeNet-FullClassify, and Full-Inception, to cope with the variety of logo resolutions, and classified
logos with these architectures. The logos are then detected together with their locations by using the
features of raw images and proposed regions.

Based on the results of the previous studies, deep learning gives
acceptable and reliable results compared with classical methods. We therefore propose a new CNN model
to recognize logos; the following sections give the details of this model.


2. PROPOSED FRAMEWORK 

The logo recognition framework is based on a CNN model. Deep learning has proved
effective in recognition applications, so a CNN model is proposed to address the logo
recognition problem. As summarized in Table 1, the model contains multiple layers, with 2D convolutions and
several max pooling layers. The following sections explain the proposed method in detail.


2.1. Dataset 

In this paper, logos were collected manually for 25 ministries and establishments in Iraq, with 10 logos
for each of them in various sizes, colors, resolutions, and lighting conditions. Figure 1 shows a sample of
the collected dataset: the first row shows examples for the higher ministry, the second row shows the College
of Science at the University of Kerbala, and so on.


2.2. Proposed CNN architecture 

The proposed method uses a CNN model to classify different logos. These logos vary in
rotation, scale, and background. As shown in Figure 2, the proposed CNN model
contains three 2D convolutional layers, each followed by a max pooling layer. Table 1
summarizes the proposed CNN architecture.


2.3. Preprocessing step 

In this step, every image in the dataset is resized to 224x224 to match the input layer. Features are
extracted in the subsequent layers. To avoid overfitting, the rectified linear unit (ReLU) activation function
was used together with data augmentation. The convolutional layers are followed by max pooling layers, which
extract the informative features from the raw data. Max pooling selects the maximum value within each window.
Figure 3 shows an example of max pooling.
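The max pooling operation described above can be sketched in a few lines of plain Python. The 2x2 window with stride 2 is an assumption inferred from Table 1, where each pooling layer halves the spatial size; the input values are illustrative and are not those of Figure 3.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Slide a size x size window over a 2D list and keep the maximum of each window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Illustrative 4x4 feature map.
example = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
pooled = max_pool_2d(example)
print(pooled)  # [[6, 4], [8, 9]]
```

Each output value is the largest entry of one non-overlapping 2x2 block, so a 4x4 map is reduced to 2x2 while the strongest responses are kept.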
Figure 1. Samples of the dataset


Because the number of samples is small, data augmentation was used to enlarge
the dataset by applying different transformations (scaling and rotation). Augmentation yields a reliable
number of samples for each logo, making the dataset large enough to serve as input to the CNN. A CNN is
known to need a large amount of data for training, so data augmentation is a suitable solution when the
dataset is small.
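A deliberately simplified Python sketch of rotation and scale augmentation is given below. The paper does not state the rotation angles, scale factors, or library used, so the 90-degree rotations, the integer scale factor, and all function names here are assumptions for illustration only; a practical pipeline would typically use arbitrary angles with interpolation.

```python
def rotate_90(image):
    """Rotate a 2D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def scale_nearest(image, factor):
    """Enlarge an image by an integer factor using nearest-neighbour replication."""
    scaled = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        scaled.extend(list(wide) for _ in range(factor))
    return scaled


def augment(image):
    """Return the original image plus three rotated copies and one scaled copy."""
    variants = [image]
    rotated = image
    for _ in range(3):
        rotated = rotate_90(rotated)
        variants.append(rotated)
    variants.append(scale_nearest(image, 2))
    return variants


logo = [[1, 2], [3, 4]]  # tiny stand-in for a logo image
variants = augment(logo)
print(len(variants))  # 5
```

In this sketch each original image yields five training samples, which shows how a dataset of only a few images per class can be enlarged before training.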
Table 1 gives an overview of the proposed model, which is an exact implementation of the
conceptual model used to work out the number of learnable parameters in a CNN.
From Table 1, the first convolutional layer has 2432 learnable parameters, the second and third
convolutional layers have 25632 learnable parameters each, and the output layer
has 82976 parameters, for a total of 136,672 learnable parameters in the entire network. Table 2 compares
different methods applied to logo recognition. The results in the table
motivate the use of a CNN to recognize logos.
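The parameter counts quoted above can be reproduced with a short Python sketch. It assumes a 100x100x3 input, 5x5 kernels with no padding, 32 filters per convolution, and 2x2 pooling; these values are not stated in the text and are inferred here because they are the ones consistent with the output shapes and counts in Table 1. The helper names are illustrative.

```python
def conv2d(shape, filters, kernel):
    """Output shape and parameter count of an unpadded, stride-1 2D convolution."""
    height, width, channels = shape
    out_shape = (height - kernel + 1, width - kernel + 1, filters)
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return out_shape, params


def max_pool(shape, size=2):
    """Output shape of non-overlapping max pooling (no learnable parameters)."""
    height, width, channels = shape
    return (height // size, width // size, channels)


def dense(units_in, units_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return units_in * units_out + units_out


shape = (100, 100, 3)  # assumed input size, inferred from Table 1
param_counts = []
for _ in range(3):
    shape, params = conv2d(shape, filters=32, kernel=5)
    param_counts.append(params)
    shape = max_pool(shape)

flat = shape[0] * shape[1] * shape[2]
param_counts.append(dense(flat, 32))
total = sum(param_counts)
print(param_counts, flat, total)
```

Under these assumptions the sketch gives 2432, 25632, 25632, and 82976 parameters for the four learnable layers, a flattened size of 2592, and a total of 136,672, matching Table 1.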
Figure 2. Proposed CNN structure for logo classification


Figure 3. Max pooling 


Table 1. Model “sequential_1”

Layer (type)                      Output Shape          Param #
Conv2d_3 (Conv2D)                 (None, 96, 96, 32)    2432
Activation_4 (Activation)         (None, 96, 96, 32)    0
Max_pooling2d_3 (MaxPooling2D)    (None, 48, 48, 32)    0
Conv2d_4 (Conv2D)                 (None, 44, 44, 32)    25632
Activation_5 (Activation)         (None, 44, 44, 32)    0
Max_pooling2d_4 (MaxPooling2D)    (None, 22, 22, 32)    0
Conv2d_5 (Conv2D)                 (None, 18, 18, 32)    25632
Activation_6 (Activation)         (None, 18, 18, 32)    0
Max_pooling2d_5 (MaxPooling2D)    (None, 9, 9, 32)      0
Flatten_1 (Flatten)               (None, 2592)          0
Dense_1 (Dense)                   (None, 32)            82976
Activation_7 (Activation)         (None, 32)            0

Total params: 136,672
Trainable params: 136,672
Non-trainable params: 0


Official logo recognition based on multilayer convolutional neural network model (Zahraa Najm Abdullah) 


ISSN: 1693-6930
Table 2. Comparison of different methods applied to logo recognition

Paper: [1]
Application and approach: Applied segmentation and spatial density for detecting the logo
Dataset size: -
Pre-processing: The logos are translated, scaled, orientated, and degraded
Result: After eight test models, the detection rate is 94.74%

Paper: [5]
Application and approach: Compared the HOG and SIFT methods in logo detection; HOG is better than SIFT
after HOG performs image transformation (resizing and rotation)
Dataset size: A local news agency dataset containing images with 10 companies’ logos (32-93 images for each
logo)
Pre-processing: Rotated and inclined from an image before HOG detection
Result: SIFT method achieved 20.6% precision and 14% recall; HOG method achieved 33.7% precision and 39.5%
recall

Paper: [6]
Application and approach: Applied HOG histogram for visual representations of target brand logos and used
an SVM classifier
Dataset size: 3060 web page training; 1979 unique snapshots
Pre-processing: -
Result: 93.50% precision; 77.94% recall score; F1-score 85.02%

Paper: [7]
Application and approach: Identify TV logos by using time-average edges for logo detection and ICA2 features
for logo classification
Dataset size: 3040 logo images database
Pre-processing: -
Result: 99.2% accuracy on logo images; 96.0% average accuracy on running video

Paper: [9]
Application and approach: Classify VHRI by the proposed MLCG-OCNN method in two levels; the object level
performs per-object classification; the pixel level refines the classification result
Dataset size: 6000x6000 pixels for each of the Potsdam images; 2817x2557 pixels for the largest Vaihingen
images
Pre-processing: Resize operation of the object
Result: Classification performance > 80% from traditional method; 4-5 times faster in computational
efficiency for VHRI classification

Paper: [12]
Application and approach: Recognized the brand by using a graphic logo detection system and FRCN with
transfer learning
Dataset size: ILSVRC (1000 categories and 1.2 million images); FlickrLogos-32 (32 different brand logos,
70 images per class/brand logo and 6000 non-logo images)
Pre-processing: Used horizontal flipping of the training images for data augmentation
Result: Used mAP to obtain the results at 60000 iterations

Paper: [14]
Application and approach: Recognize the logo by two modes: a universal logo detector to learn the
characteristics of a logo and find the regions of a logo; a logo recognizer to classify the logo by nearest
neighbor and triplet-loss with proxies
Dataset size: Product logo (PL2K) of 2000 logos from 295K images; FlickrLogos-32 of 32 logos from 8K images
Pre-processing: -
Result: 97% recall with 0.6 mAP on PL2K; 0.565 mAP on FlickrLogos-32

Paper: [17]
Application and approach: Open set approach; searched and retrieved large-scale unseen logos and new domains
from one query sample, based on CNN
Dataset size: 871 brands; 11,054 logo images
Pre-processing: -
Result: Average precision (0.368 to 0.464)

Paper: Our proposed method
Application and approach: CNN model
Dataset size: Collected dataset from Google
Pre-processing: -
Result: 99.16%


3. EVALUATION OF THE PROPOSED MODEL
In the experiments, the dataset was split into 80% for training and 20% for testing. The accuracy on the
dataset is 99.16%. Cross-validation was also performed on this dataset to improve the outcomes. The model
succeeds in recognizing multiple official logos with different characteristics (different scale and texture).
The number of epochs used in the experiments is 20, which gave significant results in classifying the logos.
Other numbers of epochs were tried in the training phase, but 20 was the optimal number that gave acceptable
results.
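The 80%-20% split and the accuracy measure can be sketched in Python as follows. The sketch assumes a per-class (stratified) split of the 25 classes with 10 logos each described in Section 2.1; the paper does not say whether the split was stratified or applied before or after augmentation, and the file names and function names are hypothetical.

```python
import random


def stratified_split(samples_by_class, train_fraction=0.8, seed=0):
    """Split each class separately so train and test keep the same class balance."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))
        train += [(sample, label) for sample in shuffled[:cut]]
        test += [(sample, label) for sample in shuffled[cut:]]
    return train, test


def accuracy(predicted, actual):
    """Percentage of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# 25 classes with 10 logos each; the file names are placeholders.
dataset = {c: [f"class{c}_logo{i}.png" for i in range(10)] for c in range(25)}
train, test = stratified_split(dataset)
print(len(train), len(test))  # 200 50
```

With this split every class contributes 8 training and 2 test images, so no ministry is missing from either partition.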


4. CONCLUSION 
In this paper, a new CNN model has been proposed to recognize various types of logos. CNNs play a
vital role in classification and recognition problems, and on that basis the proposed work employs a CNN to
recognize logos. These logos are the official logos of Iraqi government ministries. The findings
show that the proposed model recognizes and classifies these logos effectively.
The accuracy of the framework is 99.16%, which makes the model acceptable and reliable.
As future work, the system can be extended to examine logos in order to distinguish between real and fake
logos.